Assessing Relative Sentence Complexity using an Incremental CCG Parser
نویسندگان
چکیده
Given a pair of sentences, we present computational models to assess if one sentence is simpler to read than the other. While existing models explored the usage of phrase structure features using a non-incremental parser, experimental evidence suggests that the human language processor works incrementally. We empirically evaluate if syntactic features from an incremental CCG parser are more useful than features from a non-incremental phrase structure parser. Our evaluation on Simple and Standard Wikipedia sentence pairs suggests that incremental CCG features are indeed more useful than phrase structure features achieving 0.44 points gain in performance. Incremental CCG parser also gives significant improvements in speed (12 times faster) in comparison to the phrase structure parser. Furthermore, with the addition of psycholinguistic features, we achieve the strongest result to date reported on this task. Our code and data can be downloaded from https://github. com/bharatambati/sent-compl.
منابع مشابه
Improving the Efficiency of a Wide-Coverage CCG Parser
The C&C CCG parser is a highly efficient linguistically motivated parser. The efficiency is achieved using a tightly-integrated supertagger, which assigns CCG lexical categories to words in a sentence. The integration allows the parser to request more categories if it cannot find a spanning analysis. We present several enhancements to the CKY chart parsing algorithm used by the parser. The firs...
متن کاملIntegrated Supertagging and Parsing
Parsing is the task of assigning syntactic or semantic structure to a natural language sentence. This thesis focuses on syntactic parsing with Combinatory Categorial Grammar (CCG; Steedman 2000). CCG allows incremental processing, which is essential for speech recognition and some machine translation models, and it can build semantic structure in tandem with syntactic parsing. Supertagging solv...
متن کاملIncremental Semantic Construction Using Normal Form CCG Derivation
This paper proposes a method of incrementally constructing semantic representations. Our method is based on Steedman’s Combinatory Categorial Grammar (CCG), which has a transparent correspondence between the syntax and semantics. In our method, a derivation for a sentence is constructed in an incremental fashion and the corresponding semantic representation is derived synchronously. Our method ...
متن کاملMink: An Incremental Data-Driven Dependency Parser with Integrated Conversion to Semantics
While there are several data-driven dependency parsers, there is still a gap with regards to incrementality. However, as shown in Brick and Scheutz [3], incremental processing is necessary in human-robot interaction. As is shown in Nivre et al. [12], dependency parsing is well-suited for mostly incremental processing. However, there is as of yet no dependency parser that combines syntax and sem...
متن کاملWide-Coverage Efficient Statistical Parsing with CCG and Log-Linear Models
This paper describes a number of log-linear parsing models for an automatically extracted lexicalized grammar. The models are “full” parsing models in the sense that probabilities are defined for complete parses, rather than for independent events derived by decomposing the parse tree. Discriminative training is used to estimate the models, which requires incorrect parses for each sentence in t...
متن کامل